Fix the issue where list user throws an error when executed concurrently with create user.#17583
Conversation
…ently with `create user`.
7306213 to
fca1809
Compare
CRZbulabula
left a comment
There was a problem hiding this comment.
Code Review
Overview
This PR fixes a ConcurrentModificationException that occurs when list user is executed concurrently with create user / drop user. The root cause is that listAllEntities() and listAllEntitiesInfo() in BasicRoleManager iterate entityMap (a HashMap) without locking, while concurrent createUser / deleteEntity calls modify the map under per-entity HashLock. HashMap iterators are fail-fast and throw ConcurrentModificationException on structural modification.
The fix replaces HashMap with ConcurrentHashMap, whose iterators are weakly consistent and never throw CME.
Analysis
The fix is correct. ConcurrentHashMap is the right solution here because:
listAllEntities()(line 235) andlistAllEntitiesInfo()(line 243) iterate the map without any lock- The existing
HashLockis per-entity-name, not a global map lock, so it can't protect iteration - Adding a global read/write lock would be more invasive and could hurt write throughput
- Weakly consistent iteration (seeing some but not all concurrent modifications) is acceptable for a
LIST USERSoperation
Inheritance covers both managers. BasicUserManager extends BasicRoleManager and calls super(accessor), so the ConcurrentHashMap change applies to the user manager too. The unprotected entityMap.entrySet() iteration in BasicUserManager:135 (getEntity(long entityId)) also benefits from this fix.
Issues & Suggestions
1. Test: Missing timeout on barrier.await() (minor risk)
If one thread fails before reaching the barrier, the other thread blocks indefinitely. The 30-second pool termination timeout would eventually unblock, but an explicit barrier timeout is cleaner:
barrier.await(10, TimeUnit.SECONDS);2. Test: Consider adding a message to assertTrue
The assertTrue(pool.awaitTermination(30, ...)) will cause a generic assertion failure. A descriptive message would help debugging:
assertTrue("Threads did not finish in 30s", pool.awaitTermination(30, TimeUnit.SECONDS));3. Pre-existing issue (not in scope, but worth noting): reset() is not thread-safe
BasicRoleManager.reset() (line 219-229) does entityMap.clear() then loads entities — even with ConcurrentHashMap, concurrent reads during reset() can see an empty or partial map. This is likely only called during initialization so it's low risk, but it's worth noting.
Summary
| Aspect | Assessment |
|---|---|
| Correctness | ✅ ConcurrentHashMap is the right fix |
| Scope | ✅ Minimal change targeting the root cause |
| Test coverage | ✅ Reproduces the concurrent scenario well |
| Performance | ✅ No concern |
| Risk | ✅ Low — drop-in replacement for the usage patterns |
Overall looks good. The core fix is correct and well-scoped. Consider the minor test robustness improvements above.
Fix the issue where
list userthrows an error when executed concurrently withcreate user.